|Poster:||hans.heiri||Date:||Jul 31, 2017 7:02am|
|Forum:||faqs||Subject:||robots.txt no longer supported to exclude page from being crawled?|
Somebody from our legal department told us that the Internet Archive will no longer support robots.txt as a way to exclude websites from being crawled.
In the FAQ section, I found a note saying that one has to send a request to email@example.com to "have the page excluded from the wayback machine", but it doesn't say whether robots.txt still works or not.
Does anyone have newer information on this topic?
Any help is appreciated, and thanks in advance,
|Poster:||MeditateOrDie||Date:||Jul 31, 2017 11:42am|
|Forum:||faqs||Subject:||Re: robots.txt no longer supported to exclude page from being crawled?|
I'm not aware of any change to that policy.
Note: not all site indexers/crawlers obey robots.txt; it is
only a request, not an access control. So if you don't want
something accessible, configure your servers to use permissions
(e.g. authentication) or other tricks to prevent unauthorized
access to anything you'd prefer to protect.
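
To illustrate why robots.txt is only advisory, here is a rough sketch of the check a well-behaved crawler would run before fetching a page, using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs and the helper name is_allowed are made up for illustration, and "ia_archiver" is assumed here to be the user-agent name the Wayback Machine has historically been associated with; check the current FAQ before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler entirely,
# and keep every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a *polite* crawler performs voluntarily.
    Nothing on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeOtherBot"):
        for url in ("https://example.com/page.html",
                    "https://example.com/private/report.html"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

A crawler that never runs a check like this can still request every URL, which is why anything sensitive needs real server-side protection (authentication, IP restrictions and so on) rather than a robots.txt entry.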